*******************************************************************************
CONST.ACC
Version 1
November 6, 1985
by Randy Forgaard
CompuServe 70307,521
This file presents some hints for choosing values for the special constants
required by the Turbo Access portion of the Turbo Pascal implementation of the
Turbo Database Toolbox (formerly the Turbo Toolbox), versions 1.0 and 1.1. It
applies to all operating systems and computers for which the Database Toolbox
is available. There are no hard facts in this file that are not also in the
Toolbox manual, but the hints below may help if your program is going haywire
and you suspect that the values of the Turbo Access constants may be the source
of the problem. These hints might also help increase the speed of Turbo Access
as used by your program.
*******************************************************************************
The Turbo Access portion of the Turbo Database Toolbox asks the programmer to
declare 6 integer constants in the Turbo Pascal program, namely MaxDataRecSize,
MaxKeyLen, PageSize, Order, PageStackSize, and MaxHeight, prior to the {$I}
directives that bring in the Toolbox source files. Bad values for these
constants can result in anything from poor performance to mysterious program
crashes. Below, these constants are discussed individually.
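As a rough sketch of that layout, the top of a program might look like the
following. The values are placeholders only, not recommendations. The file
names GETKEY.BOX, ADDKEY.BOX, and DELKEY.BOX are assumptions here; check them
against your own Toolbox disk. The program compiles only if those files are
present.

```pascal
program Skeleton;
const
  { Placeholder values only; see the sections below for how to choose them. }
  MaxDataRecSize = 64;
  MaxKeyLen      = 25;
  PageSize       = 80;
  Order          = 40;   { always half of PageSize }
  PageStackSize  = 10;
  MaxHeight      = 5;

{ The six constants must be declared before these directives. }
{ Under DOS Turbo 3.0, use ACCESS3.BOX in place of ACCESS.BOX. }
{ File names other than ACCESS.BOX are assumptions; check your disk. }
{$I ACCESS.BOX}
{$I GETKEY.BOX}
{$I ADDKEY.BOX}
{$I DELKEY.BOX}

begin
  { your program, which may now call the Turbo Access routines }
end.
```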
MaxDataRecSize
--------------
MaxDataRecSize is the size of the largest record you will be storing in any
DataFile. That is, if you are going to have two kinds of DataFiles, one to
store records of type R1, and one to store records of type R2, and R2 records
occupy more storage than R1 records, MaxDataRecSize should be set to the number
of bytes occupied by records of type R2. If MaxDataRecSize is larger than
necessary, Turbo Access will still work properly, but some memory will be
wasted.
Suppose that your program is to have two kinds of DataFiles, some of which hold
records of type Person, and some of which hold records of type Loan, where
these record types are defined as follows:
type
  Person = record
    name: String[30];
    age: Byte;
    married: Boolean
  end;

  Loan = record
    company: String[40];
    secured: Boolean;
    months: Integer;
    interestRate, payment: Real
  end;
MaxDataRecSize must be set to the size of Person or Loan, whichever is larger.
It is dangerous to compute MaxDataRecSize by hand. For example, in computing
the size of Person, one might say that Person occupies 30 + 1 + 1 = 32 bytes.
But the actual answer is 33 bytes (the String[30] type occupies 31 bytes, due
to the additional byte for the length). In computing the size of Loan, one
might forget that there are actually two Reals, even though they are listed on
one line. Furthermore, the number of bytes occupied by a Real could be 6, 8,
or 10 bytes, depending on whether regular Turbo, Turbo-87, or Turbo BCD is
being used.
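To make those pitfalls concrete, here is the corrected hand count written out
as a few lines of Turbo. It is only an illustration of the arithmetic, and it
assumes that Byte and Boolean occupy 1 byte each and Integer occupies 2 bytes.
RealSize is a name introduced for this sketch; set it to 6, 8, or 10 as
described above.

```pascal
program HandCount;
const
  RealSize = 6;   { 6 = regular Turbo, 8 = Turbo-87, 10 = Turbo BCD }
begin
  { A String[n] occupies n + 1 bytes because of the length byte. }
  { Person: name + age + married }
  writeln('Person = ', (30 + 1) + 1 + 1);
  { Loan: company + secured + months + two Reals }
  writeln('Loan   = ', (40 + 1) + 1 + 2 + 2 * RealSize)
end.
```

With RealSize set to 6, the two sums come to 33 and 56.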
To be sure to pick an appropriate value for MaxDataRecSize, create a little
Turbo program that includes the type definitions for the records you will be
storing in DataFiles, and write out the size of each of those records using
Turbo's built-in SizeOf function, so that you can choose the maximum of those
values. For the above record types, one could write a program like this:
program DisplaySizes;
<...type definitions of Person and Loan go here...>
begin
  writeln('Size of Person = ', SizeOf(Person));
  writeln('Size of Loan = ', SizeOf(Loan))
end.
Compile and run this small program under the same version of Turbo that you
will be using for your real program. Running this program under regular Turbo
gives us 33 bytes for the size of Person, and 56 bytes for the size of Loan.
Under Turbo-87 we get 33 and 60 bytes, respectively, and Turbo BCD yields 33
and 64. Assuming that we are using regular Turbo, we would set the value of
MaxDataRecSize to 56 (the larger of 33 and 56). To be safe, in case we decide
to use Turbo BCD in the future and forget to change MaxDataRecSize accordingly,
we might set MaxDataRecSize to 64.
It is very important that MaxDataRecSize be the correct value, or larger. A
value for MaxDataRecSize that is too small results in mysterious and erratic
program behavior, and the cause of the problem can be very difficult to find.
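One possible safeguard, offered only as a sketch, is to have the program
compare SizeOf of each record type against MaxDataRecSize when it starts, and
stop if the constant is too small. The program name and messages below are
illustrative. In a real program the same test would go at the top of the main
block, after the Toolbox files have been included.

```pascal
program CheckRecSize;
const
  MaxDataRecSize = 64;
type
  Person = record
    name: String[30];
    age: Byte;
    married: Boolean
  end;
  Loan = record
    company: String[40];
    secured: Boolean;
    months: Integer;
    interestRate, payment: Real
  end;
begin
  { Refuse to run if any record type is larger than the constant allows. }
  if (SizeOf(Person) > MaxDataRecSize) or (SizeOf(Loan) > MaxDataRecSize) then
  begin
    writeln('MaxDataRecSize is too small; increase it and recompile.');
    Halt
  end;
  writeln('MaxDataRecSize is large enough.')
end.
```

A check like this costs a few bytes of code and catches the case where a
record type grows, or the compiler version changes, after the constant was
chosen.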
MaxKeyLen
---------
MaxKeyLen is the length of the longest keys you will be using for any of your
IndexFiles. In Turbo Access, all keys are strings, and MaxKeyLen is a string
length. For example, if your longest keys are 25 characters long, you would
use String[25] as the type of any variables that are to hold those keys, and
you would set MaxKeyLen to 25.
Note that this is a different idea from MaxDataRecSize. The obvious difference
is that MaxKeyLen refers to the keys in the IndexFiles, and MaxDataRecSize
refers to the data records in the DataFiles. The subtle difference is that
MaxKeyLen is the length of the string, _not_ the number of bytes such a string
would occupy. If your keys are up to 25 characters long, set MaxKeyLen to 25.
However, if a String[25] is part of a data record, the String[25] would have to
be counted as 26 bytes for the purposes of computing MaxDataRecSize.
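The distinction can be seen directly with a short program. KeyStr is a name
introduced for this sketch only.

```pascal
program KeyLenVsBytes;
const
  MaxKeyLen = 25;
type
  KeyStr = String[MaxKeyLen];   { type for variables that hold keys }
begin
  { The string length is what MaxKeyLen describes. }
  writeln('MaxKeyLen      = ', MaxKeyLen);
  { The storage size is one byte more, for the length byte. }
  writeln('SizeOf(KeyStr) = ', SizeOf(KeyStr))
end.
```

The first line shows 25 and the second shows 26.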
PageSize
--------
PageSize is the number of key entries in each page of every IndexFile used by
your program. It must be an even number between 4 and 254, inclusive. Beyond
this restriction, it is difficult to choose a "correct" value for PageSize; it
is a performance/space trade-off, and a non-linear one at that. If you choose
a value for PageSize that is too small, Turbo Access will have to traverse many
IndexFile pages during a search, which usually means a lot of disk I/O. A
value for PageSize that is too large will use up a lot of memory without
yielding a proportionately larger increase in execution speed.
The issue is further complicated in that a single value for PageSize must be
chosen that will be used for all IndexFiles in your program, even though
different PageSize values will be optimal for IndexFiles with different maximum
key lengths. For the purpose of choosing a PageSize value, identify the
IndexFile whose access speed is most critical to your application. In what
follows, let K denote the maximum key length of that IndexFile.
Some tests with Turbo Access, on both a hard disk and a floppy, seem to suggest
the following rule of thumb for achieving good time/space efficiency: Choose
PageSize so that the product of PageSize and K is close to 2000. Smaller
values for PageSize can cause Turbo Access to run measurably slower. Larger
values for PageSize will use up valuable memory space that could be more
profitably used by increasing PageStackSize (see below), rather than PageSize.
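The rule of thumb can be turned into a small calculation. The sketch below
picks the even number nearest to 2000 / K and then forces it into the legal
range of 4 to 254. It is only a starting point for your own timing tests, not
a guarantee of the best value.

```pascal
program SuggestPageSize;
var
  K, PageSize: Integer;
begin
  write('Maximum key length K of the most critical IndexFile: ');
  readln(K);
  if K < 1 then
    writeln('K must be at least 1')
  else
  begin
    { Twice the rounded value of 1000 / K is the even number nearest 2000 / K. }
    PageSize := 2 * Round(1000.0 / K);
    { Keep the result inside the range Turbo Access accepts. }
    if PageSize < 4 then PageSize := 4;
    if PageSize > 254 then PageSize := 254;
    writeln('Suggested PageSize  = ', PageSize);
    writeln('Corresponding Order = ', PageSize div 2)
  end
end.
```

For K = 25 this suggests a PageSize of 80 and an Order of 40. For very short
keys the suggestion is capped at 254.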
Order
-----
The value of Order is simply half of the value of PageSize. Since PageSize is
always even, Order will be an integral value. The name "order," in the
terminology of trees, refers to the number of children each node of the tree
has. A binary tree is order 2. In B+ trees, the number of children varies,
but is always at least half of the PageSize. Hence the name Order for half of
PageSize.
PageStackSize
-------------
PageStackSize is the number of pages that Turbo Access keeps internally, as a
cache, so that it does not need to read pages from disk as often. The value of
PageStackSize must be greater than or equal to 3. The Toolbox manual notes
that the "minimum reasonable value for PageStackSize is the value of MaxHeight"
(see below). Indeed, empirically, extraordinary performance degradation does
seem to result if PageStackSize is less than MaxHeight.
Like PageSize, the choice of a value for PageStackSize is a performance/space
trade-off. Larger values for PageStackSize will allow Turbo Access to run
faster, but will also use more of the memory that the Turbo compiler sets aside
for global variables in your program. After PageSize has been chosen as per
above, choose the largest possible value for PageStackSize that will still
permit the rest of your program to have the memory for global variables that it
needs. This may be a trial-and-error process, since large values for
PageStackSize may cause a "Memory Overflow" error from Turbo while compiling
global variable declarations in either the Turbo Access source files or in your
own code.
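A rough way to see how quickly the cache grows is to estimate its size before
compiling. The sketch below is a guess at the order of magnitude, not a
statement of how Turbo Access actually lays out its pages. It assumes each key
entry occupies MaxKeyLen + 1 bytes for the key plus EntryOverhead bytes of
record and page references. EntryOverhead is a name invented for this sketch,
and its value of 4 is an assumption. Per-page bookkeeping is ignored.

```pascal
program EstimateCache;
const
  MaxKeyLen     = 25;
  PageSize      = 80;
  PageStackSize = 10;
  EntryOverhead = 4;    { ASSUMED bytes of references per key entry }
var
  Estimate: Real;
begin
  { Real arithmetic is used so that the product cannot overflow an Integer. }
  Estimate := 1.0 * PageStackSize * PageSize * (MaxKeyLen + 1 + EntryOverhead);
  writeln('Approximate bytes used by the page cache: ', Estimate:0:0)
end.
```

With the values shown, the estimate is 24000 bytes. The "Memory Overflow"
message from the compiler remains the real test.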
MaxHeight
---------
MaxHeight is the maximum height that the B+ tree in any IndexFile can attain.
It is a function of the PageSize and the maximum number of keys (including
duplicates, if permitted) that you will allow in any of the IndexFiles your
program uses. You can find the correct value for MaxHeight by running the
following Turbo program, which implements the MaxHeight formula given in the
Toolbox manual:
program FindMaxHeight;
var
  PageSize, MaxHeight: Integer;
  MaxKeyCount: Real;
begin
  write('PageSize: ');
  readln(PageSize);
  write('Maximum number of keys that can be stored in any IndexFile: ');
  readln(MaxKeyCount);
  MaxHeight := Round(Ln(MaxKeyCount) / Ln(PageSize * 0.5)) + 1;
  writeln('MaxHeight = ', MaxHeight)
end.
Increasing MaxHeight by one adds only 4 bytes to each IndexFile variable your
program maintains. Thus, just in case you underestimate the maximum number of
keys that will be stored in any IndexFile, you might want to add one or two
onto the value of MaxHeight computed by the above program.
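The numeric restrictions stated in the sections above can be collected into
one small check program. This is a sketch: the program name, the Problems
counter, and the messages are illustrative, and the constant values are
placeholders to be replaced with your own.

```pascal
program CheckConstants;
const
  PageSize      = 80;
  Order         = 40;
  PageStackSize = 10;
  MaxHeight     = 5;
var
  Problems: Integer;
begin
  Problems := 0;
  if (PageSize < 4) or (PageSize > 254) or Odd(PageSize) then
  begin
    writeln('PageSize must be an even number from 4 to 254.');
    Problems := Problems + 1
  end;
  if Order * 2 <> PageSize then
  begin
    writeln('Order must be exactly half of PageSize.');
    Problems := Problems + 1
  end;
  if PageStackSize < 3 then
  begin
    writeln('PageStackSize must be at least 3.');
    Problems := Problems + 1
  end;
  if PageStackSize < MaxHeight then
  begin
    writeln('PageStackSize below MaxHeight: expect very slow access.');
    Problems := Problems + 1
  end;
  if Problems = 0 then
    writeln('No problems found.')
end.
```

With the placeholder values shown, the program reports no problems.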
General Notes
------- -----
After selecting values for the above constants, compiling your program, and
creating database files with those constant values in force, it is possible
that you might want to change those constant values. With MaxDataRecSize,
MaxKeyLen, PageStackSize, and MaxHeight, you can change the constant (subject
to the constraints stated in the paragraphs above), recompile your program, and
still be able to read IndexFiles that were created under the old constant
values.
After changing PageSize and Order, however, you will no longer be able to read
IndexFiles created with the old values for PageSize and Order. In this case,
you will have to rebuild the IndexFiles with the new values for PageSize and
Order in effect.
To rebuild IndexFiles, you can always read an old DataFile, "d," by doing a
GetRec(d,i,r) for each data record number "i" from 1 to FileLen(d)-1, and
bypass the IndexFiles entirely. However, you will need to make sure that you
have reserved the first two bytes of each data record, so that you can tell
whether it has been deleted when reading it with GetRec. When rebuilding
IndexFiles, you will want to skip any data records marked as deleted. See the
"Reuse of Deleted Data Records" section of the Toolbox manual for details.
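The rebuild described above might be sketched as follows. Treat it as an
outline, not tested code. It compiles only with the Toolbox source files
present. The parameter lists shown for InitIndex, OpenFile, MakeIndex, AddKey,
CloseIndex, and CloseFile are assumptions to be checked against the manual.
The file names PERSON.DAT and PERSON.IX, the status field, and the rule that
a status of 0 means "not deleted" are likewise assumptions for illustration.

```pascal
program RebuildIndex;
const
  MaxDataRecSize = 64;
  MaxKeyLen      = 30;
  PageSize       = 64;   { the new value }
  Order          = 32;   { half of the new PageSize }
  PageStackSize  = 10;
  MaxHeight      = 5;

{ Under DOS Turbo 3.0, use ACCESS3.BOX in place of ACCESS.BOX. }
{ ADDKEY.BOX is an assumed file name; check your Toolbox disk. }
{$I ACCESS.BOX}
{$I ADDKEY.BOX}

type
  Person = record
    status: Integer;     { reserved first two bytes; ASSUMED 0 when in use }
    name: String[30];
    age: Byte;
    married: Boolean
  end;

var
  d: DataFile;
  ix: IndexFile;
  r: Person;
  i, recNum: Integer;
  key: String[30];

begin
  InitIndex;
  OpenFile(d, 'PERSON.DAT', SizeOf(Person));
  { Last argument ASSUMED to mean: 1 = duplicate keys allowed. }
  MakeIndex(ix, 'PERSON.IX', 30, 1);
  for i := 1 to FileLen(d) - 1 do
  begin
    GetRec(d, i, r);
    if r.status = 0 then       { skip records marked as deleted }
    begin
      key := r.name;
      recNum := i;             { copy, since the loop variable is not passed }
      AddKey(ix, recNum, key)
    end
  end;
  CloseIndex(ix);
  CloseFile(d)
end.
```

The DataFile itself is never rewritten here. Only a fresh IndexFile is built
from the records that are still in use.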
Finally, some notes about the versions of Database Toolbox and their
compatibility with the versions of Turbo:
(1) If you are using DOS Turbo 3.0, make sure you have Turbo 3.01A or later
(version 3.00B does not work with Turbo Access).
(2) If you are using DOS Turbo 3.0, make sure you use ACCESS3.BOX in place of
ACCESS.BOX. ACCESS3.BOX is included on the Turbo 3.0 disk (not the
Database Toolbox disk).
(3) If you have version 1.0 of the Database Toolbox, see the file TBXFIX in
DL 1 of the Borland SIG on CompuServe, to upgrade your Toolbox to
version 1.1 (fixes some bugs).
If you are not running under DOS, or if you are using Turbo 2.0, only (3) above
applies to you.
If you have any further questions about Turbo Access, or would like to discuss
some aspect of it, please feel free to ask on CompuServe's Borland SIG, and/or
to send a message to me personally on the Borland SIG or via EasyPlex.